This report was prepared in the Department of Mathematical Statistics,
Columbia University, New York, by


T.  W.  ANDERSON,  Ph.D. 


ABSTRACT 


The function of factor analysis in time series analysis is discussed. The
relationships between the models in the two approaches are observed, with a
distinction being drawn between exogenous and endogenous variables. Various
approaches are suggested for analyzing multiple time series by factor analysis.


This  technical  documentary  report  has  been  reviewed  and  is  approved. 
THE USE OF FACTOR ANALYSIS IN THE STATISTICAL ANALYSIS
OF MULTIPLE TIME SERIES


1.  INTRODUCTION 

Factor analysis consists of a set of statistical
technics, with related mathematical
models, that has been developed particularly
for analyzing the observed scores of many
individuals on a large number of intelligence
tests. These methods have been applied to
other sets of measurements of individuals and
of other entities. The use has been further
extended to analyze a set of variables that
refer to one individual and that are observed
on several occasions. This is the use of factor
analysis in the statistical analysis of multiple
time series.

Factor analysis technics have been used
largely by psychologists, and their applications
to time series data have been mostly at the
hands of psychologists. Cattell (4, 6) particularly
has discussed this use of the technic,
which he calls “the P-technique.” The present
paper arose from comments made by the author
at a symposium sponsored by the American
Psychological Association,¹ which included a
general survey of this use of factor analysis by
Cattell (6); a study of psychologic, psychiatric,
physiologic, and chemical measurements of a
schizophrenic patient by Mefferd et al. (13);
and a study of biochemical characteristics of a
cancer patient by Damarin (7). A summary
of the two latter studies will be given in
section 2 as examples of the use of factor analysis
in the statistical analysis of multiple time
series. The general problem will be stated
formally (section 3), and a mathematical
statement of factor analysis will be given
(section 4).

¹ Symposium, “[...] of repeated [...] on the same individual.”
Washington, Division [...] of the American Psychological
Association, 3 Sept. [...]


The primary purpose of this paper is to
outline the possible functions of factor analysis
in time series analysis (section 5), to discuss
some of the problems and difficulties that
arise, and to point out limitations on its
usefulness (sections 6 and 7). The paper is directed
especially to psychologists, as they seem to be
favorably disposed to using factor analytic
technics on data in the form of time series.
Economists, among others, have been analyzing
time series for many decades. Although some
of the technics they apply are quite specialized,
they have also considered technics based on a
mathematical model quite similar to that of
factor analysis. Such technics, however, have
been largely displaced by other methods based
on different models such as stochastic
difference equations. A brief description of these
methods is given in sections 8 and 9. Although
some aspects of the analysis of repeated
psychologic and physiologic measurements
are rather unique, it should be
noted that there are many aspects common to
time series analysis as developed in other
disciplines.

2.  TWO  EXAMPLES  OF  THE  FACTOR 
ANALYSIS  OF  TIME  SERIES  DATA 

Two substantive studies were reported to
the symposium which led to the preparation
of this paper. In a study by Mefferd et al.
(13), a schizophrenic patient was observed over
a period of 261 days, during which time several
types of shock treatment were given. Daily
measurements of the patient’s characteristics
included 30 chemical constituents of urine and
6 physiologic quantities (body weight, temperature,
pulse and respiration rates, and systolic
and diastolic blood pressures). The daily scores
on a psychiatric rating scale were considered
as not sufficiently variable to include in the
analysis. During the last 75 days daily scores
were obtained on alternate forms of 6 psychologic
tests (copying, number facility, four-letter
words, digit span, and two time
perceptions). During this latter period whether
or  not  the  treatment  was  given  was  indicated 
by  a  coded  variable.  Four  environmental  varia¬ 
bles  were  maximum  outside  temperature, 
average  outside  temperature,  average  wind 
speed,  and  sunspot  activity.  The  variables 
subjected  to  analysis  also  included  the  day 
number  (furnishing  a  linear  trend)  and  three 
other  synthetic  (or  dummy)  variables  to  pro¬ 
vide  cycles  of  2  and  4  days.  The  correlations 
between  these  51  variables  were  computed  over 
the  last  75  days,  during  which  time  only  one 
type  of  treatment  was  given.  The  factor 
analysis  of  these  correlations  yielded  8  or¬ 
thogonal  factors,  which  were  then  rotated.  The 
rotated  factors  were  then  “identified”  by  the 
values  of  the  loadings  of  the  observed  variables. 
Another  factor  analysis  was  carried  out  over 
an  earlier  period  when  a  different  treatment 
was  given,  and  a  third  factor  analysis  was  made 
on  the  data  in  the  two  periods. 

The  other  study  was  made  of  a  patient  with 
advanced  cancer  of  the  prostate  (7).  During 
and  after  hormonal  therapy,  measurements 
were  made  on  Mondays,  Wednesdays,  and  Fri¬ 
days  over  a  period  of  about  63  weeks.  The 
rate of tumor growth in the prostate was
indicated by the titer of an enzyme (acid
phosphatase), and the rate of repair of bony tissue
damaged  by  metastatism  was  indicated  by  the 
titer  of  another  enzyme  (alkaline  phosphatase). 
Other  physiologic,  biochemical,  and  metabolic 
variables  (including  strength  of  medication  for 
pain)  were  also  included,  as  well  as  climatic 
variables  and  two  synthetic  variables  to  repre¬ 
sent  a  linear  trend  and  a  weekly  cycle.  The 
period  of  study  between  initial  therapy  with 
estrogen  and  death  was  divided  into  three 
stages:  improvement,  remission,  and  relapse, 
with  measurements  on  63  occasions  in  each 
period  and  on  37,  44,  and  47  variables  in  the 
three  periods,  respectively.  A  factor  analysis 
of  the  matrix  of  correlations  over  the  63  time 
points  in  each  phase  yielded  16  oblique  factors. 
Within each phase the 16 factors were
identified. Of these, 4 identified factors seemed to
be  common  to  the  three  phases,  5  factors 
appeared  in  pairs  of  phases,  and  the  other 
7  factors  were  distinct  in  each  phase.  The 
investigator  related  the  appearance  and  dis¬ 
appearance  of  factors  over  the  three  stages  to 
the  course  of  the  disease. 


3.  DATA  AND  GOALS  OF  TIME  SERIES 
ANALYSIS 


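To state our problem in general terms, we
suppose that some $p$ measurements $(y_{1t}, y_{2t},
\ldots, y_{pt})$ are made on a given individual at
each of $T$ successive times $(t = 1, 2, \ldots, T)$.
The set of data on the individual constitutes a
$p \times T$ matrix

$$
\begin{pmatrix}
y_{11} & y_{12} & \cdots & y_{1T} \\
y_{21} & y_{22} & \cdots & y_{2T} \\
\vdots & \vdots &        & \vdots \\
y_{p1} & y_{p2} & \cdots & y_{pT}
\end{pmatrix}. \tag{1}
$$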


We shall suppose the successive times are
equally spaced, such as measurements made on
successive days. (Equal spacing is advantageous
for making use of the time scale.) For example,
in one case reported by Mefferd et al.
(13), $y_{24,t}$ is the score on the test of number
facility on the $t$-th day and $y_{40,t}$ is the systolic
blood pressure on the $t$-th day. These $p$ variables
characterize the individual in such a study
and are, therefore, of primary concern to the
investigator.


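There may be a related set of $q$ variables
that we may term environmental, which are
also measured at each time point $t$. These form
another matrix

$$
\begin{pmatrix}
z_{11} & z_{12} & \cdots & z_{1T} \\
z_{21} & z_{22} & \cdots & z_{2T} \\
\vdots & \vdots &        & \vdots \\
z_{q1} & z_{q2} & \cdots & z_{qT}
\end{pmatrix}. \tag{2}
$$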

Some are strictly environmental, such as average
daily temperature and wind speed; others
may describe a treatment given an individual;
and some may be synthetic or dummy variables
such as time trend (that is, $z_{jt} = t$) and cyclic
or periodic variables (such as $z_{kt} = (-1)^t$).
This second set of variables describes the
conditions acting on the individual.
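
As an illustration of such synthetic variables, the following sketch builds a
trend and cycles of 2 and 4 time units for a short series. The value of T, the
variable names, and the use of a cosine and sine pair for the 4-unit cycle are
assumptions made for this example; the studies cited do not state how their
cyclic variables were coded.

```python
import numpy as np

T = 8                                   # illustrative number of time points
t = np.arange(1, T + 1)                 # t = 1, 2, ..., T

trend = t.astype(float)                 # z_t = t, a linear trend
cycle2 = (-1.0) ** t                    # z_t = (-1)^t, a cycle of period 2
cycle4_cos = np.cos(np.pi * t / 2)      # assumed coding of a period-4 cycle
cycle4_sin = np.sin(np.pi * t / 2)      # companion term, same period

# Stack the synthetic variables into a q x T matrix, as in matrix (2).
Z = np.vstack([trend, cycle2, cycle4_cos, cycle4_sin])
print(Z.shape)
```

None of these rows has an irregular component: each is fully determined by t.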

Some  economists  have  used  the  terms  en¬ 
dogenous  and  exogenous  for  the  corresponding 
sets  of  time  series  in  economic  studies.  Cattell 
(5)  uses  the  terms  endogenous  (spontaneous) 
and  exogenous  (situational).  The  endogenous 
variables  form  the  set  that  the  system  is  sup¬ 
posed  to  generate;  we  are  trying  to  “explain” 
these.  For  example,  in  an  investigation  of  the 
United  States  economy,  national  income  might 
be  an  endogenous  variable.  The  exogenous 
variables  are  the  set  that  acts  on  the  system; 
these  are  taken  as  given.  For  instance,  the 
size  of  the  population  of  the  United  States 
might  be  taken  as  exogenous. 

This  classification  is  not  absolutely  clean- 
cut.  Some  variables  may  describe  the  treat¬ 
ment  given  a  patient,  such  as  the  amount  of  a 
drug.  The  treatment  acts  on  the  individual 
and  in  that  sense  is  environmental.  However, 
if  during  the  period  of  study  the  course  of  treat¬ 
ment  is  modified  because  of  the  development 
of  the  variables  characterizing  the  individual, 
these  treatment  variables  can  hardly  be  re¬ 
garded  as  exogenous.  In  such  cases,  the  se¬ 
quence  of  values  of  the  treatment  variables 
cannot  be  taken  as  given;  it  becomes  part  of 
the  system  under  study. 

The  classification  of  dummy  variables  may 
also be questionable. In an economic study a
combination of trend terms such as $z_{jt} = t$ is
usually taken to represent a collection of
variables not treated explicitly, such as measures of
technologic development; it is supposed that all
of  the  variables  characterizing  the  system  are 
adequately  represented  by  the  endogenous  varia¬ 
bles  in  the  study  and,  hence,  that  the  synthetic 
variables  are  in  lieu  of  unspecified  exogenous 
variables.  In  any  case,  of  course,  it  is  clear 
that  such  variables  are  exogenous  in  the  sense 
that  they  are  determined  outside  the  system 
studied and must be taken as given. It is the
interpretation that may be difficult; in
psychologic and physiologic studies of individuals it
may not be clear whether a trend term
represents environmental conditions, internal
quantities, or both.

A  modest  goal  of  time  series  analysis  is  to 
give  a  succinct  description  of  the  variables  of 
interest  as  observed  over  the  assigned  period 
of  time.  Data  on  one  variable  at  one  time  for 
several  individuals  are  often  summarized  by 
quantities  like  the  mean  and  variance.  The 
summarization  of  data  on  several  variables 
may  include  correlation  coefficients.  Data  in 
time  may  involve  other  modes  of  summariza¬ 
tion  because  observations  in  a  time  series  are 
ordered. Some such summary statistics
are serial correlation coefficients (also termed
autocorrelations or lag correlations). For
example, the first-order serial correlation
coefficient is the correlation between adjacent
elements in a time series; that is, the pair $x_t$,
$x_{t-1}$ replaces the pair $y_t$, $z_t$ in the computation
of an ordinary correlation coefficient.
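
The computation just described can be sketched as follows. The function name
`serial_correlation` is hypothetical, and the sketch uses the ordinary
product-moment correlation of the lagged pairs; other definitions of the serial
correlation coefficient (for example, with a common mean and variance for both
members of the pair) are also in use.

```python
import numpy as np

def serial_correlation(x, lag=1):
    """Correlation between x_t and x_{t-lag} over all available pairs."""
    x = np.asarray(x, dtype=float)
    if lag < 1 or lag > len(x) - 2:
        raise ValueError("lag must leave at least two pairs")
    current = x[lag:]        # x_t     for t = lag+1, ..., T
    earlier = x[:-lag]       # x_{t-lag} for the same t
    return np.corrcoef(current, earlier)[0, 1]

steady_rise = [1.0, 2.0, 3.0, 4.0, 5.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(serial_correlation(steady_rise), serial_correlation(alternating))
```

A steadily rising series gives a first-order coefficient of 1, and a strictly
alternating series gives -1.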

Another  goal  of  the  study  of  a  time  series 
is  prediction.  Given  the  sequence  of  one  or 
several  variables,  the  investigator  wants  to  say 
something  about  the  value  of  a  variable  anti¬ 
cipated  at  some  point  in  the  future.  The 
economist may want to predict national income
next year; the psychologist may want to predict
the degree of elation of a patient tomorrow.
Prediction  may  be  done  on  an  ad  hoc  basis,  or 
it  may  involve  a  deeper  analysis. 

A  more  encompassing  goal  of  time  series 
analysis  is  an  understanding  of  the  mechanism 
generating  the  characteristics  of  interest;  in 
other  terms,  one  wants  to  “explain”  the  se¬ 
quence  of  observed  quantities.  This  is  usually 
the  goal  of  scientific  inquiry.  A  mathematical- 
statistical  model  to  specify  the  formation  of  the 
variables  under  given  conditions  is  to  be  desired. 

In  some  cases  an  explanation  or  determina¬ 
tion  of  a  time  series  of  a  single  variable  can  be 
given  in  terms  of  itself.  For  instance,  the 
height  of  a  swinging  pendulum  at  each  point  in 
time  is  a  function  (trigonometric)  of  time  which 
satisfies  a  second-order  differential  equation; 
that  is,  its  position  at  a  future  time  is  deter¬ 
mined  by  its  present  position,  velocity,  and 
acceleration  (which  are  equivalent  to  the  infor¬ 
mation  given  by  its  position  at  three  different 
times). It may be that the time series of some
psychologic  variable  might  be  explained  by  that 
series  alone;  perhaps  a  characteristic  of  mood 
is  at  least  roughly  explained  by  its  earlier 
fluctuations. 
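
A small numerical sketch may make the pendulum example concrete. It assumes an
idealized, frictionless, small-amplitude pendulum whose position is a sinusoid
sampled at equal intervals; the amplitude, phase, and frequency values are
arbitrary choices for illustration. Such a series satisfies the second-order
difference equation x[t+1] = 2 cos(omega) x[t] - x[t-1], so each value is
determined by earlier values of the same series.

```python
import numpy as np

omega = 0.3                 # angular frequency per time step (assumed)
amplitude, phase = 2.0, 0.5 # arbitrary illustrative values
t = np.arange(50)
x = amplitude * np.cos(omega * t + phase)

# Predict x[t+1] from the two preceding values of the same series.
predicted = 2.0 * np.cos(omega) * x[1:-1] - x[:-2]
max_error = np.max(np.abs(predicted - x[2:]))
print(max_error)
```

When the frequency is known, two earlier positions suffice in this idealized
case. When it is not, three successive positions are needed, since they fix the
coefficient 2 cos(omega) through (x[t+1] + x[t-1]) / x[t] wherever x[t] is not
zero.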

In  other  cases  several  variables  are  tied 
together  and  the  entire  complex  may  be  ex¬ 
plained  in  terms  of  its  earlier  history.  Given  a 
stable  environment,  a  certain  set  of  psychologic 
or  physiologic  quantities  may  interact  in  a  sys¬ 
tem,  the  development  of  one  quantity  affecting 
another.  In  such  cases  the  investigator  seeks 
a  mechanism  explaining  the  entire  constellation. 

Often  there  are  external  factors  that  influ¬ 
ence  the  individual  characteristics.  Sometimes 
a  time  series  analysis  may  only  be  relating  the 
variable  of  interest  to  the  changing  environ¬ 
mental  condition.  The  most  complex  system 
includes  exogenous  variables  which  affect  the 
set  of  endogenous  variables  which  are  them¬ 
selves  interacting. 

4.  FACTOR  ANALYSIS 

Factor analysis can be considered as the
analysis of data in terms of a particular
mathematical model. We shall state the model
mathematically for a general set of variables
$x_{it}$, $i = 1, \ldots, h$, $t = 1, \ldots, T$. (The variables
noted in section 3 fit into this framework with
$x_{it} = y_{it}$, $i = 1, \ldots, p$, and $x_{p+j,t} = z_{jt}$, $j = 1,
\ldots, q$, with $h = p + q$.) The factor analysis
model is

$$
x_{it} = \xi_{it} + u_{it}, \tag{3}
$$

$$
\xi_{it} = \mu_i + \sum_{\alpha=1}^{m} \lambda_{i\alpha} f_{\alpha t}. \tag{4}
$$

The $\xi_{it}$ are the “systematic parts” of the $x_{it}$.
The $u_{it}$ are “errors”; each is composed of a
“specific factor,” which is special to the variable
(that is, $i$), and an error of measurement.
The $\mu_i$ are constants; the $\lambda_{i\alpha}$ are common factor
loadings; and the $f_{\alpha t}$ are common factor scores.
The basic (unobservable) variables $(u_{1t}, \ldots,
u_{ht}, f_{1t}, \ldots, f_{mt})$, $t = 1, \ldots, T$, are considered
to be $T$ independent sets of observations from
some population; models 3 and 4 indicate how
these are combined to form the observable
variables. Sometimes the set of factor scores
$f_{\alpha t}$ for $T$ particular individuals or times are
considered as fixed or nonrandom variables,
called incidental parameters, while the errors
$u_{it}$ are treated as random observations. The
essential mathematical assumptions are that the
errors $u_{it}$ are uncorrelated and that the factors
$f_{\alpha t}$ are uncorrelated with the errors $u_{it}$. The
population mean values of the errors and
factors can be taken as 0, and then the population
mean value of $x_{it}$ is $\mu_i$. Except for special
choices of the factor loadings $\lambda_{i\alpha}$, the observed
variables are correlated and the correlation is
due to the common factors.

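The information in the observations is
summarized by the sample means,

$$
\bar{x}_i = \sum_{t=1}^{T} x_{it} / T,
$$

sample variances

$$
s_{ii} = \sum_{t=1}^{T} (x_{it} - \bar{x}_i)^2 / (T - 1),
$$

and sample covariances

$$
s_{ij} = \sum_{t=1}^{T} (x_{it} - \bar{x}_i)(x_{jt} - \bar{x}_j) / (T - 1).
$$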
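
These summary statistics can be computed directly from the data matrix. The
sketch below uses simulated values purely as a stand-in for real measurements;
the sizes h and T, the random seed, and the variable names are assumptions for
illustration. Rows are variables and columns are time points, as in matrices
(1) and (2).

```python
import numpy as np

rng = np.random.default_rng(0)
h, T = 3, 20                          # illustrative sizes
X = rng.normal(size=(h, T))           # placeholder data, h variables by T times

xbar = X.sum(axis=1) / T              # sample means, one per variable
deviations = X - xbar[:, None]        # x_it minus the mean of variable i
S = deviations @ deviations.T / (T - 1)   # variances on the diagonal,
                                          # covariances off the diagonal
print(np.allclose(S, np.cov(X)))
```

The result agrees with `numpy.cov`, which by default also treats rows as
variables and divides by T - 1.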

If the factor scores and errors are normally
distributed, these quantities form a sufficient
set of statistics. The sample covariance matrix
is “factored” to obtain estimates of the factor
loadings. This involves approximating the
observed covariance matrix $(s_{ij})$ by an estimated
theoretic covariance matrix with elements
$\sum_{\alpha} \lambda^*_{i\alpha} \lambda^*_{j\alpha} + \delta_{ij} \sigma_i^{*2}$
(the starred quantities being estimates of the
loadings and error variances, and $\delta_{ij}$ being 1 if
$i = j$ and 0 otherwise), so that
the difference between these two matrices
consists of elements close to 0. Usually the
procedure includes the determination of $m$ (the
number of factors) from the sample. Since
there is indeterminacy in the specification of
the loadings $\lambda_{i\alpha}$ and factor scores $f_{\alpha t}$, the
factoring yields a set of estimates, and one of these
is selected (for example, by “rotation” to
simple structure) to facilitate the interpretation of
the factors.
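
The indeterminacy mentioned above can be checked numerically. The sketch
assumes uncorrelated factors with unit variance, so that the theoretic
covariance matrix is the loading matrix times its transpose plus a diagonal
matrix of error variances; the loadings, error variances, and rotation angle
are made-up values for illustration.

```python
import numpy as np

# Hypothetical loadings of 4 variables on 2 factors, and error variances.
loadings = np.array([[0.8, 0.0],
                     [0.7, 0.2],
                     [0.1, 0.9],
                     [0.0, 0.6]])
error_var = np.array([0.36, 0.47, 0.18, 0.64])
covariance = loadings @ loadings.T + np.diag(error_var)

# An orthogonal rotation of the factors (angle chosen arbitrarily).
theta = 0.7
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
rotated_loadings = loadings @ rotation
rotated_covariance = rotated_loadings @ rotated_loadings.T + np.diag(error_var)

print(np.allclose(covariance, rotated_covariance))
```

The two sets of loadings differ, yet they imply the same covariance matrix, so
the data alone cannot choose between them; a criterion such as simple structure
is needed to select one.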

Given the factor loadings one can estimate
the factor scores $f_{\alpha t}$ corresponding to observed
variables $x_{it}$ for each $t$. The estimated factor
score is usually a linear function of the observed
$x_{it}$ for that $t$. If the factor scores and errors
are normally distributed, the linear functions
may be the regression of the factor scores on
the observed variables; if the factor scores are
taken as incidental parameters, the estimated
factor scores can be maximum likelihood
estimates under the assumptions that the loadings
are given and the errors are normally
distributed.
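
Both kinds of estimate are linear in the observations and can be sketched as
below. The sketch makes assumptions beyond the text: the factors are taken as
uncorrelated with unit variance, and the error variances as well as the
loadings are treated as known. The function names and all numerical values are
hypothetical. The second function is the weighted least squares form, which
coincides with maximum likelihood when the errors are normal with known
variances.

```python
import numpy as np

def regression_scores(X, mu, loadings, error_var):
    """Regression of factor scores on the observed variables (random factors)."""
    covariance = loadings @ loadings.T + np.diag(error_var)
    return loadings.T @ np.linalg.solve(covariance, X - mu[:, None])

def weighted_ls_scores(X, mu, loadings, error_var):
    """Scores treated as fixed (incidental) parameters, errors weighted by variance."""
    weighted = loadings.T / error_var          # loadings' times inverse error variances
    return np.linalg.solve(weighted @ loadings, weighted @ (X - mu[:, None]))

loadings = np.array([[0.8, 0.0], [0.7, 0.2], [0.1, 0.9], [0.0, 0.6]])
error_var = np.array([0.36, 0.47, 0.18, 0.64])
mu = np.array([1.0, 2.0, 3.0, 4.0])
true_scores = np.array([[1.0, -0.5, 0.0],
                        [0.3,  0.8, -1.2]])    # 2 factors at 3 time points
X = mu[:, None] + loadings @ true_scores       # error-free data for checking

print(weighted_ls_scores(X, mu, loadings, error_var))
```

With error-free data the weighted least squares scores reproduce the true
scores exactly, while the regression scores are shrunk toward zero. Applied
column by column, either function turns a multiple time series of test scores
into a time series of estimated factor scores.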

There  are  different  statistical  methods  of 
carrying  out  the  determination  of  m  and  esti¬ 
mating  the  loadings  and  factor  scores.  Fre¬ 
quently,  the  covariances  are  converted  to 
correlations  and  the  centroid  method  of  factor¬ 
ing  is  applied.  A  survey  and  analysis  of  the 
different  technics  from  the  mathematical- 
statistical  point  of  view  has  been  given  else¬ 
where  (3).  The  interest  in  this  paper  is  the 
usefulness  of  any  factor  analysis  methods  in 
the  statistical  analysis  of  multiple  time  series. 

5. THREE POSSIBLE FUNCTIONS OF FACTOR
ANALYSIS IN THE STUDY OF TIME SERIES

Measurement  of  underlying  traits 

Factor  analysis  in  the  sense  of  an  analysis 
of  the  model  presented  in  section  4  has  been 
developed  primarily  as  a  means  of  ascertaining 
underlying traits, or latent variables (the
factors); these are characteristics of individuals
that cannot be measured directly but that
do  affect  suitably  chosen  variables  that  can 
be  measured  or  observed.  The  model  is  widely 
accepted  as  adequately  representing  reality  in 
certain  areas  of  psychology.  This  is  particu¬ 
larly  true  in  the  case  where  the  observed 
variables  are  scores  on  various  kinds  of  intel¬ 
ligence  tests  and  the  unobserved  variables 
are  factors  representing  mental  abilities. 
In such cases, the observed data together with
a  statistical  analysis  appropriate  to  the  model 
may  yield  measurements  of  the  underlying 
traits.  It  might  be  pointed  out  that  in  using 
this  approach  to  evaluate  underlying  traits,  the 
psychologist  designs  and  chooses  the  instru¬ 
ments  (tests,  questionnaires,  etc.)  to  tap  the 
traits  in  which  he  is  interested. 

The  estimate  of  a  factor  score  or  measure¬ 
ment  of  a  trait  of  a  given  individual  at  a  given 
time  comes  out  as  a  linear  combination  of  the 
observed “test” scores of that individual at that
time. A time sequence of the factor scores of
one  individual  can  be  estimated  by  taking  this 
linear  combination  of  the  test  scores  observed 
at  a  sequence  of  time  points.  This  time  series 
of  estimated  factor  scores  can  be  analyzed  by 
itself,  or  it  can  be  included  in  an  analysis  with 
other  time  series.  This  use  of  a  combination  of 
other  variables  is  analogous  to  the  economist’s 
use  of  an  index,  such  as  a  price  index.  As  is 
known  to  econometricians,  the  use  of  an  index 
poses  its  own  statistical  problems,  but  this  is 
not  the  place  for  a  discussion  of  the  indices 
used  in  economics. 

The  selection  of  the  traits,  or  factors,  and 
the  determination  of  the  linear  combinations  of 
observations  to  estimate  them  can  be  made 
from  repeated  observations  on  one  individual, 
or  from  observations  at  one  time  on  many  in¬ 
dividuals.  If  certain  specified  linear  combina¬ 
tions  of  scores  of  certain  specified  tests  are 
accepted  a  priori  as  measuring  certain  traits,  it 
would  seem  reasonable  to  use  such  measures  in 
a  time  series  analysis.  Of  course,  this  recom¬ 
mendation  is  qualified  by  the  requirement  that 
the  instruments  are  not  substantially  affected 
by  time  (practice,  fatigue,  etc.),  but  this  diffi¬ 
culty  will  arise  with  any  repeated  use  of  such 
instruments.  Cattell  (4,  p.  896)  has  suggested 
the  use  of  “marker  variables”  in  studies  involv¬ 
ing  factor  analysis;  these  variables  are  tests 
which  are  taken  to  have  high  loadings  on  cer¬ 
tain  desired  factors  and,  hence,  assist  in  obtain¬ 
ing  measurements  of  specified  factors. 

A set of tests or other instruments
observed for one individual over time may not
show the same pattern of variation as when
observation is made on many individuals at one
time. Cattell (5) has distinguished factors
found from the two types of analysis by calling
an underlying characteristic varying over
individuals a trait and an underlying characteristic
varying over time a state. In some
cases the former may give more relevant
measures. In many instances it may be more
important to find an individual’s time pattern
of a measurement specified in advance.

The  use  of  factor  analytic  methods  to  estab¬ 
lish  the  measurement  of  an  underlying  charac¬ 
teristic  is  only  a  preliminary  step  in  a  time 
series analysis; it simply determines a means
of  converting  a  time  series  of  multiple  scores  on 
certain  instruments  into  a  time  series  of  quan¬ 
tities  considered  more  fundamental.  The 
development  of  these  quantities  in  time  and 
their  relation  to  other  variables  is  then  the 
subject  of  further  inquiry. 

Exploration 

The  term  “factor  analysis”  is  often  used  to 
mean  any  kind  of  analysis  of  a  large  set  of 
variables  which  leads  to  the  selection  or  forma¬ 
tion  of  a  smaller  set  of  variables  for  more 
intensive  study.  This  is  particularly  true  of 
statisticians  not  well  acquainted  with  statistics 
as  used  in  psychology.  In  this  loose  sense, 
factor  analysis  does  not  necessarily  mean  a 
procedure  based  on  the  model  of  section  4.  In 
particular,  principal  component  analysis  can  be 
viewed  as  a  method  of  exploration  (see  refer¬ 
ence  1,  chapter  11). 

In  many  situations  where  a  large  number  of 
measurements  are  made  there  is  little  or  no 
knowledge  of  the  variables  involved  or  of  the 
underlying  mechanism.  In  these  cases  it  may 
be  appropriate  to  use  an  omnibus  statistical 
technic  to  explore  the  data — that  is,  to  find 
the  variables,  or  functions  of  variables,  that 
show  variability,  or  that  are  related.  In  these 
instances,  the  exploratory  study  is  only  a  pre¬ 
liminary  stage  of  investigation;  it  points  the 
way to more basic study. In many types of
inquiries concerning individuals, there is the
possibility of using the results of one study in
the study of other individuals; the psychologist
can  search  the  data  of  one  individual  for  ap¬ 
parently  relevant  variables  and  then  study 
these  variables  for  other  individuals.  This 
feature  is  in  contrast  to  the  economist  studying 
the  entire  economy  of  the  United  States  and, 
hence,  being  limited  to  one  set  of  data. 

Besides  the  various  forms  of  factor  analysis, 
there  are  other  methods  of  exploration.  If  two 
distinct  sets  of  variables  are  under  considera¬ 
tion,  canonical  correlation  analysis  may  be  more 
appropriate  (see  reference  1,  chapter  12).  If 
some  form  of  factor  analysis  is  used,  care  must 
be  exercised  in  selecting  the  procedure.  There 
are questions as to whether correlations or
covariances should be used and how communalities
should be treated. Some methods are based
on  the  idea  that  factors  account  for  the  depend¬ 
ence  between  observed  variables  as  indicated  in 
section  4.  Other  methods  are  based  on  the  idea 
that  “factors”  account  for  the  variances,  as 
with  principal  component  analysis.  One  method 
may  give  results  quite  different  from  those 
given  by  another  method. 

As  shall  be  noted  in  more  detail  later,  factor 
analysis  does  not  take  into  account  the  time 
relationships  of  the  variables.  If  factor  analy¬ 
sis  or  principal  component  analysis  is  used  for 
exploration,  there  is  a  danger  of  missing  im¬ 
portant  and  interesting  characteristics  which 
are  significant  because  of  their  development  in 
time,  though  not  because  of  their  variability  or 
high  relationship  to  other  variables. 
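
This point can be illustrated numerically: the covariance matrix, which is all
that factor analysis or principal component analysis uses, is unchanged when
the time points are shuffled, whereas the serial correlation is destroyed. The
two simulated series, the seed, and the helper name `lag1` are assumptions made
for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200
t = np.arange(T)
# Two slowly varying series with a little added noise (illustrative only).
X = np.vstack([np.sin(0.1 * t) + 0.1 * rng.normal(size=T),
               np.cos(0.1 * t) + 0.1 * rng.normal(size=T)])

order = rng.permutation(T)
X_shuffled = X[:, order]              # same observations, time order scrambled

def lag1(x):
    """First-order serial correlation of one series."""
    return np.corrcoef(x[1:], x[:-1])[0, 1]

print(np.allclose(np.cov(X), np.cov(X_shuffled)))   # covariances identical
print(lag1(X[0]), lag1(X_shuffled[0]))              # time structure differs
```

Any analysis based only on the covariance or correlation matrix would therefore
give the same factors for the ordered and the shuffled data.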

If  factor  analysis  is  used  in  exploration,  it 
may  be  reasonable  that  it  be  applied  only  to  a 
set  of  data  of  the  same  kind — e.g.,  intelligence 
test  scores,  personality  variables,  or  physiologic 
measurements.  If  the  investigator  wants  to 
study  several  sets  together,  he  can  explore  one 
and  then  another,  or  he  can  use  another  method, 
such  as  canonical  correlations. 

If  it  is  known  that  a  particular  variable  is 
of  special  importance  to  the  investigation,  it 
should  not  be  included  in  an  exploratory  factor 
analysis,  because  this  particular  variable  must 
be  included  in  the  ensuing  time  series  analysis, 
regardless  of  the  outcome  of  the  exploration. 
For  example,  in  Damarin’s  study  of  cancer 
therapy  (7)  the  treatment  variable  has  to  be 
studied,  and,  since  acid  phosphatase  enzyme 
titer  is  a  generally  accepted  index  of  tumor 
growth,  it  has  to  enter  the  complete  analysis, 
too. 

An  exploratory  factor  analysis  is  only  a 
preliminary  phase  of  a  time  series  analysis. 
Once  the  investigator  has  decided  what  the  im¬ 
portant  variables  are,  he  will  want  to  know 
their  interrelations  and  the  mechanism  of  their 
formation. 

Analysis 

Measurement and exploration are only
introductions to understanding processes in time.
The scientist wants to establish a mechanism
(which may be a mathematical model) which
explains or describes the generation of the
phenomena of interest. In considering
characteristics of individuals over time, the
investigator would like to discover how one phase leads
to another or how one sequence of variables
affects another sequence. In fact, eventually,
he wants to be able to say how the development
of the variables is brought about. Wold (14)
discusses these goals in terms of the analysis of
multiple economic time series.

In the next two sections, we consider the
question of the extent to which factor analysis
can be used to elucidate the formation of the
sequence of variables and the problems that
arise in this use.


6.  SOME  PROBLEMS  IN  THE  FACTOR 
ANALYSIS  OF  TIME  SERIES  DATA 

Independence  of  errors 

The error terms $u_{it}$ in model 3 are assumed
to be statistically independent; that is, the
discrepancies between the observations $x_{it}$ and
the systematic parts $\xi_{it}$ are assumed to be
uncorrelated. If one were able to make repeated
measurements on the entire matrix, there
would be no (population) correlation in the
repeated measurements on the pair $(u_{it}, u_{jt})$,
the errors in two different variables at one
time, or in the repeated measurements on the
pair $(u_{it}, u_{is})$, the errors in one variable at two
different times, or in the repeated measurements
on the pair $(u_{it}, u_{js})$, the errors in two
different variables at two different times. This
kind of assumption is reasonable if the
discrepancies are really errors of measurement, such
as might occur if one made physically
independent measurements with a tape measure. In
the situations we are considering, however, it
seems likely that the discrepancies would
include irregularities due to a multitude of
influences and effects that are not taken
explicitly into account. Some of these, the daily
weather, for example, might tend to correlate
$(u_{it}, u_{jt})$ by acting similarly on several variables
from day to day. The effect might be, however,
to correlate $(u_{it}, u_{is})$ because the effect is
spread over several days. The day-to-day
correlation may be of no greater disadvantage
than making the information in the series less
than if the observations were independent, but
the variable-to-variable correlation may be
serious enough to bias the analysis critically,
because the entire principle of factor analysis
is that only the common factors tie the observed
variables together.
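
The seriousness of variable-to-variable error correlation can be seen in a
small population-level sketch. It assumes one genuine common factor with unit
variance and an unmodeled disturbance (such as weather) that enters the errors
of the first two variables; all numerical values and names are made up for
illustration.

```python
import numpy as np

loadings = np.array([[0.8], [0.7], [0.6], [0.5]])   # one genuine common factor
error_var = np.full(4, 0.3)                          # independent error variances
shared = np.array([0.4, 0.4, 0.0, 0.0])              # disturbance acting on two errors

# Errors u = shared * w + e, so their covariance matrix is no longer diagonal.
error_cov = np.outer(shared, shared) + np.diag(error_var)
covariance = loadings @ loadings.T + error_cov

# The same covariance arises from two factors with independent errors.
two_factor_loadings = np.hstack([loadings, shared[:, None]])
two_factor_cov = two_factor_loadings @ two_factor_loadings.T + np.diag(error_var)

print(np.allclose(covariance, two_factor_cov))
```

From the covariance matrix alone, the shared disturbance cannot be told apart
from a second common factor, so a factoring of these data would attribute the
weather effect to the factor structure.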

The model is set up with interest centered in
the systematic parts, the random irregularities
being added to them. In many cases, however,
it may be more realistic to think that the
random irregularities are absorbed into the
quantity of interest and, hence, affect the
observations at a later time. This point will be
clarified in section 9 by a different kind of
model. The usual assumption that $f_{\alpha t}$ and $f_{\beta s}$,
$t \neq s$, are uncorrelated would not be fulfilled
in time series data, but this effect would
probably not be serious.

An  effect  of  time  that  is  more  peculiar  to 
psychologic  studies  is  the  effect  of  repeated 
uses of the same instruments. The individual
learns to do certain tasks, for instance, if he is
asked to do them again and again. This challenge
to the psychologist arises, of course,
because of the time aspect, not because of the
use of factor analysis.

Symmetric  treatment  of  variables 

The  method  of  factor  analysis  treats  all 
observed variables symmetrically; this feature
appears even in the statement of the model.
As indicated in section 3, however, the observed
variables can usually be divided into two
classes: those which the system is to “explain”
and those which are to be taken as given. Although
there are some difficulties of classification,
the distinction is meaningful. The latter
set may affect the former, but not vice versa.
In the study of Mefferd et al. (13), for example,
it is possible that sunspots affect the
patient’s performance on psychologic tests, but
the reverse does not seem likely. A deficiency
of factor analysis here is that it does not take
into account the different statuses or functions
of different variables; in particular, a direction
of possible causation is missing.

Another  classification  of  variables  is  ac¬ 
cording  to  whether  they  are  measured  with  or 
without actual error. For example, a synthetic
variable has no irregular component, whereas a
psychologic test may have considerable error.
It would seem more realistic in the model to set
the formal “errors” $u_{it} = 0$ for variables with
no irregularity. The method of factor analysis,
as customarily used, does not allow for this difference.
The anomaly is that a variable which
has no real error shows up with a formal
“error.” How can one interpret an error in
time ($x_{it} = t$) or a dichotomous variable ($x_{it} =
0$ or $1$) indicating absence or presence of treatment?
This situation occurs, for example, in the
study by Mefferd et al. (13). The discrepancy
between such a variable as observed and as it
enters mathematically into the fitted model is
properly considered as an error in the model,
but such an error does not have the statistical
properties assumed by the factor analysis
model.

The anomaly of obtaining a formal error $u_{it}$
for a variable that has no error of measurement
and cannot be considered as having a distribution
can be avoided by setting $u_{it} = 0$ for the
variable and modifying the statistical procedure
appropriately. This matter will be discussed in
section 7. Essentially these restrictions force
the use of regression procedures for such
variables without distributions rather than the
correlational procedures of factor analysis.

Besides  the  two  classifications  of  the  varia¬ 
bles  mentioned  above,  there  may  be  natural 
grouping  of  variables.  For  example,  the  study 
by Mefferd et al. (13) includes scores on psychologic
tests and constituents of urine. These
are different, at least in a formal sense. The
biochemical measurements are made in physical
units such as density or percentage, while the
units of psychologic scores may be quite arbitrary.
Measurements made in the same units
are  directly  comparable  and  permit  statistical 
technics  that  take  into  account  the  origins  and 
units  of  measurement. 


Interpretation  of  factors 

The factor analysis yields the representation

$$x_{it} = \bar{x}_i + \sum_{\alpha=1}^{m} \lambda^{*}_{i\alpha} f^{*}_{\alpha t} + u^{*}_{it}, \tag{5}$$

where now the $\lambda^{*}_{i\alpha}$ and $f^{*}_{\alpha t}$ are estimates
based on the observed data and $u^{*}_{it}$ is the
residual. “Rotation” to simple structure may
have been done in hopes of giving the factors
an interpretation. Do these factors have any
meaning other than what is given by the formal
statistical technic applied? It is desired that
these $f^{*}_{\alpha t}$ represent some fundamental quantities
just as the factors are considered as
primary intellectual abilities in the factor
analysis of intelligence test scores.

The  loadings  for  a  given  factor  may  be  such 
that  the  factor  relates  (a)  only  to  the  observed 
individual characteristics, (b) only to the environmental
characteristics, or (c) to both the
individual and the environmental characteristics.
In the first case, the factor would be
interpreted as a quantity fundamental to the
individual and in the second case to the environment.
In the last case, is the factor considered
as fundamental in both kinds of
variables? In some instances this may be
satisfactory; in others it may not. Mefferd et al.
(13), for example, found that factor I (in
Case D-1) has high loadings for sunspot activity
and for some biochemical measurements.
This fact may imply that sunspot activity is
correlated  with  biochemical  measurements,  but 
it  does  not  yield  any  more  basic  explanation. 

When all pairs of factors $f_{it}$ and $f_{jt}$ $(i \neq j)$
are uncorrelated as assumed in section 4, they
are called orthogonal. If these conditions of
being uncorrelated are not imposed, the factors
are called oblique. In the case of orthogonal
factors, alternative sets are related by orthogonal
transformations; alternative sets of
oblique factors are linear transformations of
each other. In either case such a transformation
is called a rotation. (The use of the term
rotation for all nonsingular linear transformations
is an undesirable conflict with mathematical
and everyday usage of the term rotation.)
In the case of an orthogonal factor system,
factors that relate only to environment are
really irrelevant, because these factors would
necessarily be uncorrelated with the individual
characteristics upon which the study is presumably
focused. If the orthogonal factor
system consists of some factors involving only
individual characteristics and other factors
involving only environmental characteristics,
the conclusion is that these two sets of variables
are independent. As will be pointed out in
more detail in section 9, however, factor analysis
is not needed for reaching such a conclusion
and is not the most efficient statistical procedure
for ascertaining independence. If the two
sets are independent, then the set of environmental
variables can be dropped from the study
because they do not affect the set of individual
variables in which the investigator is primarily
interested.

Another  possibility  to  consider  is  that  of  an 
oblique  factor  system  with  some  factors  relat¬ 
ing  to  individual  characteristics  and  some  re¬ 
lating  to  environment.  The  correlations  between 
the  factors  of  two  types  will  connect  the  two 
sets  of  observed  variables.  The  pattern  of 
factor  correlations  may  give  the  investigator 
insight  into  the  relationship  between  individual 
and  environmental  characteristics.  If  the  in¬ 
vestigator  analyzes  these  first-order  factors 
into  a  set  of  second-order  factors,  the  problems 
all arise again at the level of second-order
factors. 

The consideration of factors in studies involving
two or more different kinds of variables
suggests that the determination of the best
factor system might take this feature into account.
“Rotation to simple structure” involves
transforming a set of estimated loadings $\lambda^{*}_{i\alpha}$
by a transformation

$$\lambda^{**}_{i\beta} = \sum_{\alpha=1}^{m} \lambda^{*}_{i\alpha} a_{\alpha\beta},$$

say, and simultaneously $f^{*}_{\alpha t}$ by

$$f^{**}_{\beta t} = \sum_{\alpha=1}^{m} a^{\beta\alpha} f^{*}_{\alpha t},$$

say, where the matrix $(a^{\beta\alpha})$ is inverse to $(a_{\alpha\beta})$;
the matrix $(a_{\alpha\beta})$ is chosen to make as many $\lambda^{**}_{i\beta}$
as possible equal to zero, or at least near zero.
(There are some further requirements for simple
structure which we can neglect for purposes
of the present discussion.) When there are
two different sets of variables, the investigator
might obtain a more meaningful interpretation
of the factors if for some $\beta$’s he tried to make
$\lambda^{**}_{i\beta}$ close to zero for the $i$ corresponding to one
set of variables and for other $\beta$’s he tried to
make $\lambda^{**}_{i\beta}$ close to zero for $i$ corresponding to
the other set.
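
The transformation just described can be checked numerically. What follows is a minimal sketch in Python, not a procedure taken from the studies cited: the loadings, the factor scores, and the matrix `a` are hypothetical numbers chosen only for illustration, and the helper names `matmul` and `inverse_2x2` are likewise assumptions of the sketch. It shows that loadings transformed by $(a_{\alpha\beta})$, together with scores transformed by the inverse matrix, reproduce the same fitted systematic parts.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse_2x2(a):
    """Invert a nonsingular 2 x 2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


# Hypothetical estimates: 3 observed variables, 2 factors, 3 occasions.
loadings = [[0.8, 0.3], [0.7, 0.4], [0.2, 0.9]]   # rows i, columns alpha
scores = [[1.0, -0.5, 0.25], [0.5, 2.0, -1.0]]    # rows alpha, columns t
a = [[1.0, 0.5], [0.0, 2.0]]                      # nonsingular, not orthogonal

new_loadings = matmul(loadings, a)                # sum over alpha of loading * a
new_scores = matmul(inverse_2x2(a), scores)       # inverse matrix applied to scores

before = matmul(loadings, scores)                 # fitted systematic parts
after = matmul(new_loadings, new_scores)
max_gap = max(abs(x - y)
              for row_b, row_a in zip(before, after)
              for x, y in zip(row_b, row_a))
```

`max_gap` is zero apart from rounding error, so the fit to the data cannot by itself choose between the two factor systems; the choice rests on criteria such as simple structure or the grouping of variables suggested above.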

The  time  sequence  effect 

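As noted in section 4, any method of factor
analysis starts with the mean $\bar{x}_i = \sum_t x_{it}/T$ of
each variable, the variances

$$s_{ii} = \sum_t (x_{it} - \bar{x}_i)^2 / (T - 1),$$

and the covariances

$$s_{ij} = \sum_t (x_{it} - \bar{x}_i)(x_{jt} - \bar{x}_j) / (T - 1);$$

sometimes this information is reduced to the
correlations

$$r_{ij} = s_{ij} / \sqrt{s_{ii} s_{jj}}.$$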

The inferences about the factor loadings $\lambda_{i\alpha}$,
that is, the determination of $m$ and the estimation
of the loadings, are made on the basis of the
covariance matrix $(s_{ij})$ or the correlation
matrix $(r_{ij})$. These sums involve products
of variables at the same time ($x_{it}$ and
$x_{jt}$) but not at different times, $x_{it}$ and $x_{j,t-1}$, for
example. Thus, the timing of the variables is
lost, and the relations between the variables at
different times are absent from the summarization.
To put it another way, given the variables
$x_{it}$, one could renumber the index $t$, and the
analysis on the newly ordered variables would
be the same. However, the significant feature
of time in time series analysis is the ordering.
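
The loss of the time ordering can be seen in a small numerical check. The sketch below is only an illustration under stated assumptions: the two series are the synthetic variables $t$ and $(-1)^t$ over a hypothetical record of 20 occasions, and the function names are invented for the example. It renumbers the occasions and compares a same-time covariance, which is unchanged, with an average product of values at adjacent times, which is not.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(x, y):
    """Same-time sample covariance with divisor T - 1."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def lag_one_product(x):
    """Average product of deviations at adjacent times t and t + 1 (divisor T)."""
    mx = mean(x)
    return sum((x[t] - mx) * (x[t + 1] - mx) for t in range(len(x) - 1)) / len(x)


T = 20                                        # hypothetical number of occasions
trend = [float(t) for t in range(T)]          # a variable equal to t
cycle = [(-1.0) ** t for t in range(T)]       # a cyclic variable of period two

# Renumber the time index: even-numbered occasions first, then odd-numbered.
order = list(range(0, T, 2)) + list(range(1, T, 2))
trend_renumbered = [trend[t] for t in order]
cycle_renumbered = [cycle[t] for t in order]

same_time_before = covariance(trend, cycle)
same_time_after = covariance(trend_renumbered, cycle_renumbered)
lagged_before = lag_one_product(trend)
lagged_after = lag_one_product(trend_renumbered)
```

The two same-time covariances agree, so any analysis based only on them cannot distinguish the original record from the renumbered one; the adjacent-time products differ, which is the information a time series analysis would use.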

The effect of time may come into this analysis
indirectly in the sense that some of the $x_{it}$
are closely and obviously related to time. The
extreme cases occur when some $x_{it}$ are
mathematical functions of time such as $t$
itself or a periodic function (a cycling variable).
Thus, the timing of two individual characteristics
may be seen in part and indirectly in their
relationships to $t$ or other functions of $t$.

The estimated factor loadings may be used
to estimate the factor scores $f_{\alpha t}$ (as mentioned
in section 5). These estimated factor scores
form a multiple time series which shows the
time sequence and may be submitted to further
analysis.

The  goal  of  analysis 

If  a  factor  analysis  of  some  time  series  data 
has been done and the “rotation to simple structure”
has yielded factors that seem to have a
sensible interpretation, would this complete
the  analysis  of  the  time  series?  On  the  basis  of 
the  factor  analysis,  one  “explains”  the  ob¬ 
served  variables  by  factors,  which  vary  through 
time.  These  factors  in  themselves,  however, 
are  not  necessarily  ultimate.  In  most  cases  the 
investigator  will  not  be  satisfied  until  he  has 
explained,  or  at  least  described,  the  structure 
of  the  movement  of  the  factors  over  time. 

In time series analysis we first want a
succinct statistical description of the variation,
relations, and development of the relevant time
series data; we want to distinguish the irregular
or random manifestations from the systematic
relations. (The random effects do not
necessarily need to be simply independent,
additive effects.)

Secondly,  we  would  like  to  represent  the  reg¬ 
ularities  of  the  data  by  as  simple  a  model  as 
possible  to  yield  meaningful  interpretations  in 
the  particular  discipline.  The  desired  objective 
is  a  statistical  or  probabilistic  model  of  the 
generation of the series that is basic enough
so that (at least for a time) it may be considered
as an “explanation” of such kinds of series.

In  the  case  of  a  pendulum  the  mathematical 
model  is  a  trigonometric  function  (a  sine  or 
cosine  of  a  multiple  of  t)  or  a  second-order 
differential  equation;  this  trigonometric  func¬ 
tion  describes  the  observed  phenomena,  and  the 
differential  equation  with  the  notion  of  gravity 
is an explanation. There may be series of psychologic
measurements that have a similar
analysis; the phenomena are roughly periodic
but with random disturbances. For instance,
a person’s mood (elation or depression) may be
of this kind, and its fluctuations in time may
be described simply, for example, as a second-order
stochastic difference equation (section 9).

In  other  cases,  a  somewhat  similar  but  more 
complicated  analysis  might  be  possible.  By 
way  of  illustration,  suppose  that  Mefferd  et  al. 
(13)  had  observed  the  following  phenomena: 
One day a shock treatment was given a patient
(say, $z_{1t} = 1$ for treatment and $z_{1t} = 0$ for no
treatment); the next day some biochemical
measurement of the patient ($y_{1t}$) increased; the
following day a measure of the patient’s anxiety
($y_{2t}$) diminished; and on the next day, performance
on a psychologic test ($y_{3t}$) went up. Such
a sequence of events would have the simple
interpretation that the shock affects the body
chemistry, which in turn affects the emotional
state, which in turn affects mental performance.
Given the multiple time series $(y_{1t}, y_{2t},
y_{3t}, z_{1t})$, the investigator would like a method
of  analysis  that  would  discover  and  emphasize 
the  time  relations  mentioned  above.  Further, 
he  would  like  a  simple  statistical  model  to 
describe  these  relationships.  The  interpreta¬ 
tion  of  such  a  model  would  be  in  terms  of  the 
physiologic  and  psychologic  mechanisms. 

7.  SOME  PROBLEMS  OF  MODIFYING 
FACTOR  ANALYSIS  TO  APPLY  TO 
TIME  SERIES 

Observed  variables  as  factors 

In  many  time  series  studies,  functions  of 
time  are  included,  as  in  the  two  substantive 
studies  described  in  section  2.  These  functions 
may  be  powers  of  t  (yielding  polynomials)  and 
cyclical functions (often expressed as sines and
cosines). Such variables have no real errors of
measurement, and it is hard to give any meaning
to the assignment of a specific factor to
such a variable (unless the variable itself is a
specific factor; that is, uncorrelated with all the
other variables). Accordingly, the model defined
by equations 3 and 4 should be correspondingly
restricted. For each $i$ that denotes
such a variable (measured without error and
not decomposable into a systematic part and
specific factor) the formal error term $u_{it}$ should
be set equal to 0. Equivalently, the variance of
such a term is set equal to 0 (that is, in correlation
terms the “communalities” are 1). Let
us see the implications of such restrictions.


For illustrative purposes, suppose there is
one variable, say $x_{ht}$, without formal error, and
set $u_{ht} = 0$. Then

$$x_{ht} = \mu_h + \sum_{\alpha=1}^{m} \lambda_{h\alpha} f_{\alpha t}. \tag{6}$$

Solving equation 6 for $f_{mt}$, we obtain

$$f_{mt} = \frac{1}{\lambda_{hm}} \Bigl( x_{ht} - \mu_h - \sum_{\alpha=1}^{m-1} \lambda_{h\alpha} f_{\alpha t} \Bigr); \tag{7}$$

that is, one factor is represented by a linear
combination of the error-free, observed variable
and the other factors. Put this expression for
$f_{mt}$ in equation 4 to obtain

$$\xi_{it} = \Bigl( \mu_i - \frac{\lambda_{im}}{\lambda_{hm}} \mu_h \Bigr) + \sum_{\alpha=1}^{m-1} \Bigl( \lambda_{i\alpha} - \frac{\lambda_{im}}{\lambda_{hm}} \lambda_{h\alpha} \Bigr) f_{\alpha t} + \frac{\lambda_{im}}{\lambda_{hm}} x_{ht}, \qquad i = 1, \ldots, h-1. \tag{8}$$

This is a modified model for the other $h - 1$
observed variables; the systematic part consists
of the constant plus a linear combination of the
$m - 1$ factors plus a multiple of the error-free,
observed variable. This last variable, $x_{ht}$,
enters the equations as an independent regression
variable.


What is the appropriate statistical analysis
for this model? For the sake of statistical
analysis, we can take a convenient specification
of the “rotation” (to eliminate some indeterminacy
in the $\lambda_{i\alpha}$ and $f_{\alpha t}$) in finding the estimates
of the factor loadings. The estimated factor
system can then be transformed to another
system to facilitate interpretation. We shall
take $f_{mt}$ to be orthogonal to the other factors
(or even take all the factors to be orthogonal)
and take $\lambda_{h\alpha} = 0$ for $\alpha = 1, \ldots, m - 1$. From
any set of $\lambda_{i\alpha}$ there is a rotation to a new set
satisfying these restrictions. Then $x_{ht} = \mu_h +
\lambda_{hm} f_{mt}$; the last factor is simply the last variable
with a suitable origin and unit. In equation 8
$x_{ht}$ is therefore orthogonal to $f_{\alpha t}$, $\alpha = 1, \ldots,
m - 1$. Thus, $\xi_{it} - (\lambda_{im}/\lambda_{hm}) x_{ht}$ has a factor
structure with $m - 1$ factors. If $\lambda_{im}/\lambda_{hm}$ were
known, a factor analysis method could be applied
to $x_{it} - (\lambda_{im}/\lambda_{hm}) x_{ht}$. Not knowing these
parameters, one estimates them by taking the
(sample) regression of $x_{it}$ on $x_{ht}$, for equation 8
indicates the usual regression situation in
which $x_{ht}$ is orthogonal to the (unknown) $f_{\alpha t}$.
The estimation of the regression coefficient of
$x_{ht}$ is then done directly.

In general, the procedure is as follows: Take
the ordinary regression of the observed variables
with error on the error-free variables
(particularly trend variables). Submit the
residuals of the observed variables with error
from their regression to a factor analysis procedure.
The error-free variables are factors,
at least formally, and alternative factor systems
consist of linear combinations of the error-free
variables with the factors resulting from the
factor analysis of the residuals.
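
The first step of this procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the computation used in either study: the data are simulated, the single error-free variable is the trend $t$, all coefficients and names are hypothetical, and the factor analysis of the residuals is not implemented; the sketch stops at the residual covariance matrix that such a routine would take as input.

```python
import random


def mean(values):
    return sum(values) / len(values)


def regress_on(z, x):
    """Least-squares line of x on the error-free variable z.

    Returns (intercept, slope, residuals).
    """
    mz, mx = mean(z), mean(x)
    slope = (sum((a - mz) * (b - mx) for a, b in zip(z, x))
             / sum((a - mz) ** 2 for a in z))
    intercept = mx - slope * mz
    residuals = [b - intercept - slope * a for a, b in zip(z, x)]
    return intercept, slope, residuals


def covariance_matrix(series):
    """Sample covariance matrix (divisor T - 1) of equal-length series."""
    T = len(series[0])
    means = [mean(s) for s in series]
    return [[sum((si[t] - mi) * (sj[t] - mj) for t in range(T)) / (T - 1)
             for sj, mj in zip(series, means)]
            for si, mi in zip(series, means)]


# Hypothetical data: three variables with error, each made up of a trend,
# a shared irregular component, and its own error.
rng = random.Random(1)
T = 60
trend = [float(t) for t in range(1, T + 1)]
common = [rng.gauss(0.0, 1.0) for _ in range(T)]
trend_slopes = [0.10, -0.05, 0.20]
loadings = [0.9, 0.7, 0.5]
observed = [[b * t + l * f + rng.gauss(0.0, 0.3) for t, f in zip(trend, common)]
            for b, l in zip(trend_slopes, loadings)]

# Step 1: ordinary regression of each variable with error on the trend.
fits = [regress_on(trend, x) for x in observed]
residuals = [fit[2] for fit in fits]

# Step 2 (input only): this matrix would be submitted to a factor analysis.
residual_cov = covariance_matrix(residuals)
```

By construction the residuals are orthogonal to the trend, so the trend enters the final system as an explicit factor and the factor analysis is asked to account only for what the regression leaves unexplained.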

One alternative procedure is to submit the
original set of $h$ variables to a factor analysis
without imposing the restrictions that certain
$u_{it}$ are 0 and then to rotate the factor structure
obtained so that some of the estimated factors
are approximately the error-free variables. In
each of the two investigations reported in section 2,
the rotation to simple structure gave a
factor that had a high loading for the time
variable, that is, $t$, the factor being interpreted
as the time variable; in the study by
Mefferd et al. (13), another factor had a high
loading for the cyclic variable of period two,
namely, $(-1)^t$. From the point of view of
statistical inference, whatever factor analysis
method is used in this procedure is inefficient,
because it is based on the allowance of errors
$u_{it}$ which could have been assumed to be 0. The
interpretation of the factors requires the incongruity
of the decomposition of time or other
trend variables into two parts.

Another alternative is to impose the restrictions
that certain $u_{it}$ are 0 and to apply factor
analysis under these conditions. In this case,
the efficiency of statistical estimation would
depend on the efficiency of the factor analysis
method. The centroid method of factoring
would not be as efficient as regression analysis;
on the other hand, the maximum likelihood
method, when conditions are imposed and the
distributions are normal, would be equivalent
to taking the regression and applying maximum
likelihood to the factor analysis of residuals.

Generally  speaking,  the  more  that  regression 
analysis  can  be  used  before  factor  analysis,  the 
better.  Methods  of  regression  analysis  are 
simple  and  efficient.  The  variables  are  explicit 
as  opposed  to  factors  which  are  implicitly  or 
indirectly  obtained.  The  statistical  methods 
are  highly  developed ;  for  instance,  one  can  test 
the  hypothesis  that  a  regression  coefficient  is 
0,  which  is  equivalent  to  the  hypothesis  that  an 
observed  variable  is  independent  of  the  time 
variable,  or  he  can  give  a  confidence  interval 
for  a  coefficient. 

The  factor  analysis  model  defined  by  equa¬ 
tions  3  and  4  is  formally  a  model  indicating  the 
regression  of  the  observed  variables  on  the 
factors.  The  difference  between  regression 
analysis  and  factor  analysis  is  that  in  the 
former  the  affecting  variables  are  known  fac¬ 
tors  instead  of  unknown.  It  seems  clear  that 
use  of  this  knowledge  must  make  regression 
analysis  much  more  efficient.  These  proper¬ 
ties  are  reflected  by  the  fact  that  regression 
procedures  are  not  based  on  an  assumption  that 
errors  in  different  variables  are  uncorrelated. 

Now let us turn to the question of interpretation
when some factors are identical with
explicit variables (trend variables, or individual
or environmental characteristics), or when they
are approximately the same. If these are the
only factors, then we have “explained” our
observations by the effect of these explicit
variables. The resulting statistical model is
that some observed variables are linear combinations
of other observed variables plus errors.
This is the usual regression model with the
possible difference that, after the factor analysis,
the errors in different variables are uncorrelated.
For this model, however, regression
technics are more powerful than factor analysis
technics. One advantage of factor analysis is
that its usual procedures can lead directly to
establishing this type of model, whereas this
model would be more difficult to determine by
regression methods if possible explanatory
variables were not known in advance.

If these observed variables, which are approximately
the same as factors, are genuinely
environmental, the investigator should be satisfied
with the meaning of the model. If some
of  the  variables  are  dummy  or  synthetic  varia¬ 
bles,  the  interpretation  is  less  direct  and  the 
model  is  more  descriptive  and  less  explanatory. 

In  the  above  case,  the  factors  can  be  taken 
as  oblique  to  coincide  as  well  as  possible  with 
observed  variables.  In  case  some  factors  are 
approximately  observed  variables  and  some  are 
not,  there  is  a  question  whether  the  latter  set 
of  factors  should  be  taken  orthogonal  to  the 
former.  If  so,  the  second  set  of  factors 
“explains”  what  is  not  “explained”  by  the 
observed  variables-factors.  This  approach 
has  the  advantage  of  exploiting  to  the  fullest 
the  explicit  variables.  If  orthogonality  is 
not enforced, one has the problem of interpreting
the relation between the observed
variables-factors (at the level of direct observation)
and the other factors (at a level of being
inferred). In ordinary regression analysis
there is difficulty in interpretation when the
“independent” variables are correlated. For
instance, in the case of two such variables the
regression function can be written in terms of
the two variables as given, or in terms of the
first variable and the part of the second orthogonal
to the first, or in terms of the second
and the part of the first orthogonal to the
second.

Selection  of  lagged  correlations 

Straightforward  factor  analysis  takes  into 
account  only  simultaneous  relations  between 
variables  since  it  is  based  on  the  matrix  of 
variances and covariances $s_{ij}$. Other summarization
statistics that might be considered are
the lagged covariances

$$\frac{1}{T} \sum_{t} (x_{it} - \bar{x}_i)(x_{j,t+s} - \bar{x}_j), \qquad s = \pm 1, \pm 2, \ldots. \tag{9}$$

There are some variations of the definition
involving the range of summation (that is,
“end effects”) and dividing by the number of
terms summed instead of by $T$. These can be
converted into lag or serial correlations. To
take into account relations between two variables
at different times, it has been suggested
that for each pair of variables the covariances
or correlations with different lags be considered
and that the numerically largest one be
selected; the factor analysis is then carried
out on the $h \times h$ matrix consisting of these
selected lagged covariances, or correlations
(reference 4, p. 678, for example).
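
A minimal sketch of this selection procedure for one pair of variables follows. It assumes the divisor $T$ of expression 9 and simply drops the terms that fall outside the observed record, which is one of the possible treatments of end effects. The 24-day record, the treatment days, and the function names are hypothetical and serve only to illustrate the computation.

```python
def mean(values):
    return sum(values) / len(values)


def lagged_covariance(x, y, s):
    """(1/T) * sum over t of (x_t - xbar) * (y_{t+s} - ybar).

    Only terms with both t and t + s inside the record are summed.
    """
    T = len(x)
    mx, my = mean(x), mean(y)
    return sum((x[t] - mx) * (y[t + s] - my)
               for t in range(T) if 0 <= t + s < T) / T


def largest_lag(x, y, max_lag):
    """Return (s, covariance) for the numerically largest lagged covariance
    among s = -max_lag, ..., max_lag."""
    candidates = [(s, lagged_covariance(x, y, s))
                  for s in range(-max_lag, max_lag + 1)]
    return max(candidates, key=lambda pair: abs(pair[1]))


# Hypothetical record: a treatment indicator z_t and a response y_t that
# repeats the treatment pattern two days later.
T = 24
treatment_days = {3, 10, 18}
z = [1.0 if t in treatment_days else 0.0 for t in range(T)]
y = [1.0 if (t - 2) in treatment_days else 0.0 for t in range(T)]

best_lag, best_cov = largest_lag(z, y, max_lag=3)
```

In this constructed record the selected lag is 2, the delay built into the response. Repeating `largest_lag` for every pair of variables would fill the $h \times h$ matrix described above, with each entry possibly referring to a different lag.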

This procedure takes account of the
timing of variation of the several variables only to
a limited extent. Only the relationship between
two variables at one interval of time is used,
while the relationship may exist over several
units of time. The most striking aspect of this
limitation is that a variable is not even related
to its own earlier values. In simple cases of
time series analysis (such as the pendulum or
perhaps fluctuations in mood) a sensible and
useful analysis can be made of a single series
where lagged relationships are studied. In
multiple time series analysis considering the
effect of some variable on another, the effect
may depend on the rising or falling of the
second; that is, the effect may depend on the
rate of change, but this involves the second
variable at two relative times. An effect on a
variable that involves the physiology of an
individual may take several days to wear itself
out. Relating two variables at only one interval
of time may neglect considerable important
information.

A second difficulty in this procedure is that
of interpreting the results. The model specified
in equations 3 and 4 is no longer applicable,
and in many cases it cannot be simply modified
to be applicable. As an example, consider three
variables. The model

$$x_{1t} = \lambda_{11} f_{1t} + u_{1t}, \qquad x_{2t} = \lambda_{21} f_{1,t-1} + u_{2t}, \qquad x_{3t} = \lambda_{31} f_{1,t-2} + u_{3t} \tag{10}$$

might be applicable if the maximum lag correlation
for the first two variables is between
$x_{1,t-1} = \lambda_{11} f_{1,t-1} + u_{1,t-1}$ and $x_{2t}$ (both having
$f_{1,t-1}$), for the first and third is between
$x_{1,t-2} = \lambda_{11} f_{1,t-2} + u_{1,t-2}$ and $x_{3t}$, and for
the last two is between $x_{2,t-1}$ and $x_{3t}$. Such a
model, however, will not work if the maximum
lag correlations turn out to be between $x_{1t}$ and
$x_{2t}$, $x_{1,t-1}$ and $x_{3t}$, and $x_{2,t-3}$ and $x_{3t}$, because
the patterns of lags do not fit together. The
first pair suggests that $x_{1t}$ and $x_{2t}$ involve $f_{1t}$;
the second pair suggests that $x_{3t}$ involves $f_{1,t-1}$,
since $x_{1,t-1}$ involves this; but the third pair
suggests that $x_{3t}$ involves $f_{1,t-3}$, since $x_{2,t-3}$
involves this. These involvements are incompatible.
This example simply illustrates the
difficulties that arise because the proposed statistical
procedure is not based on a consistent
mathematical model, explicitly stated.

Holtzman  (10)  has  further  criticized  the 
procedure  on  the  grounds  that  searching  a  num¬ 
ber  of  covariances  for  the  largest  may  lead  to 
spurious or at least distorted relations simply
because some large covariances may turn up by
chance.  It  may  also  be  pointed  out  that  lagged 
covariances  can  have  considerable  sampling 
error. 
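
The selection effect can be illustrated by simulation. The sketch below is an assumption-laden illustration rather than a reproduction of any published computation: the pairs of series are independent normal noise by construction, and the record length, maximum lag, number of trials, seed, and function names are arbitrary choices. It compares the average size of the lag-0 correlation with the average size of the largest correlation found by searching over lags.

```python
import random


def mean(values):
    return sum(values) / len(values)


def lagged_correlation(x, y, s):
    """Lagged covariance (divisor T) scaled by the two standard deviations."""
    T = len(x)
    mx, my = mean(x), mean(y)
    cov = sum((x[t] - mx) * (y[t + s] - my)
              for t in range(T) if 0 <= t + s < T) / T
    sx = (sum((a - mx) ** 2 for a in x) / T) ** 0.5
    sy = (sum((b - my) ** 2 for b in y) / T) ** 0.5
    return cov / (sx * sy)


def compare(n_trials=200, T=50, max_lag=5, seed=0):
    """Average |r| at lag 0 and average of the largest |r| over all lags,
    for pairs of series that are independent by construction."""
    rng = random.Random(seed)
    lag0_total = 0.0
    selected_total = 0.0
    for _ in range(n_trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(T)]
        y = [rng.gauss(0.0, 1.0) for _ in range(T)]
        sizes = [abs(lagged_correlation(x, y, s))
                 for s in range(-max_lag, max_lag + 1)]
        lag0_total += sizes[max_lag]        # position of s = 0 in the list
        selected_total += max(sizes)
    return lag0_total / n_trials, selected_total / n_trials


mean_lag0, mean_selected = compare()
```

Because the lag-0 value is one of the candidates, `mean_selected` can never be smaller than `mean_lag0`; the size of the excess indicates how much apparent relationship the search alone can produce when none exists.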

8.  MODELS  WITH  SYSTEMATIC  PARTS 
AND  ERRORS 

The factor analysis model specified by equations
3 and 4 is mathematically equivalent to a
model considered by econometricians and mathematical
statisticians. Equations 4 can be
considered as defining an $m$-dimensional hyperplane
in the $(p + q)$-dimensional space of the
$\xi_{it}$ (one point for each $t$). For example, if there
are three measurements and one factor, this is a
line in the usual 3-dimensional space. The
factor scores are coordinates in the $m$-dimensional
space. One can take $(p + q - m)$ linear
functions of the $\xi_{it}$ and eliminate the $f_{\alpha t}$, say

$$\sum_{i} \gamma_{ri} \xi_{it} = \sum_{i} \gamma_{ri} \mu_i, \qquad r = 1, 2, \ldots, p + q - m. \tag{11}$$

These linear equations also define the $m$-dimensional
hyperplane. For example, if $p + q = 3$
and $m = 1$, the factor analysis model includes

order stochastic difference equation (or “autoregressive
scheme”),

$$y_t = \alpha y_{t-1} + u_t, \tag{14}$$

The estimation of the $\gamma_{ri}$ has sometimes been
called “confluence analysis” (5). Statisticians
have done considerable research on this problem
under such terms as “estimating structural
relation.” Formally, the problem of estimating
the factor loadings is equivalent to that of estimating
the coefficients of the structural relation;
that is, a set of factor loadings determines
a set of coefficients (with indeterminacy) and
vice versa. However, there have been different
points of view in the two lines of research. The
number of factors plus the number of structural
equations is the number of variables. Factor
analysts look for a small number of factors.
In confluence analysis, attention is given to a
small number of equations, usually one. (A
small number of equations can be treated only
if further conditions are assumed; otherwise,
the indeterminacy is too great.) While the
formal equivalence of the two approaches is
occasionally acknowledged, little has been done
to relate explicitly the multitude of results.

The formal difference between a linear space
defined “parametrically” (model 4) and by
equations as in 11 reflects a difference in interpretation
between psychologists and economists.
The latter interpret an equation as
describing  the  behavior  of  some  group  in  the 
economy;  for  instance,  one  equation  may  de¬ 
scribe  the  behavior  of  consumers. 

We shall not pursue this comparison further
nor indicate further developments along this
line because a different approach to time series
will be discussed in the next section.

9.  SOME  GENERAL  METHODS  OF 
TIME  SERIES  ANALYSIS 

Single  time  series 


indicates  that  the  observation  at  time  t  is  made 
up  of  a  part  from  the  last  observation  and  a 
random  disturbance,  where  the  random  di.s- 
turbances  u,  are  uncorrelated.  One  can  think 
of  y,  as  .starting  the  sequence;  then  y-  is 
generated  from  yi  and  u^ ;  and  in  turn  y,  is 
generated  from  y,  i  and  U(.  The  effect  of  the 
time  sequence  comes  into  this  model  by  each 
y,  I  affecting  the  next ;  often  the  parameter  « 
has  a  meaningful  interpretation ;  and  the  ir¬ 
regular  part  u,  is  incorporated  into  the  observed 
variable.  (An  error  of  measurement  could  be 
superimpo.sed.) 


The  second-order  stochastic  difference  equa¬ 
tion. 


(1.5) 


Xf  “iVt  I  -)  “u'Vi  ^  “i- 


is  a  probabilistic  analog  of  the  second-order 
differential  equation.  A  time  .series  generated 
by  this  model  will  show  oscillations  or  fluctua¬ 
tions  of  varying  length,  the  average  length 
depending  on  «,  and  This  is  the  model  for 
the  swinging  pendulum  being  given  random 
pushes  u,  at  regular  intervals.  This  model 
might  be  suitable  for  some  psychologic  char¬ 
acteristic  that  fluctuates,  such  as  mood. 


A  stochastic  difference  equation  model  is 
formally  similar  to  a  regression  model.  For 
example,  model  14  is  similar  to 


where  yt  is  the  “dependent”  variable  and  Zi  is 
the  “independent”  (or  fixed)  variable.  The 
statistical  methods  for  the  difference  equation 
(estimation  of  ot,  etc.)  are  similar  to  the  meth¬ 
ods  of  regression;  in  fact,  the  large  -  .sample 
theory  is  almost  identical.  In  equation  16,  x, 
“explains”  y,,  and  in  14  y,  i  “explains”  y,. 


First,  let  us  review  some  methods  of  uni¬ 
variate  time  series  analysis.  In  many  way.s, 
the  most  .sati.sfactory  analysis  of  a  single  time 
series,  .say  x,,  is  in  terms  of  itself.  A  first- 
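Because model 14 has the form of a regression of $y_t$ on $y_{t-1}$, its parameter can be estimated by least squares. The following is a minimal sketch in Python, assuming normally distributed disturbances with unit variance and a starting value of 0. The function names `simulate_ar1` and `estimate_alpha`, the seed, and the value 0.6 for $\alpha$ are illustrative choices and are not taken from the original discussion.

```python
import random


def simulate_ar1(alpha, n, rng, y1=0.0):
    """Generate y_1, ..., y_n from y_t = alpha * y_{t-1} + u_t (model 14)."""
    y = [y1]
    for _ in range(n - 1):
        u = rng.gauss(0.0, 1.0)  # uncorrelated disturbance u_t (assumed normal)
        y.append(alpha * y[-1] + u)
    return y


def estimate_alpha(y):
    """Least-squares slope of y_t on y_{t-1}, with no intercept."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den


rng = random.Random(0)  # fixed seed, chosen arbitrarily
series = simulate_ar1(alpha=0.6, n=5000, rng=rng)
alpha_hat = estimate_alpha(series)
print(round(alpha_hat, 2))
```

With a long series the estimate should lie close to the value used in the simulation, which is what the large-sample theory leads one to expect.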


Another kind of model is the moving average. In the simplest case,

(17)    $y_t = a_0 u_t + a_1 u_{t-1}$,

the $u_t$ are unobservable, uncorrelated disturbances. The two models
can be combined by defining the disturbance of the stochastic
difference equation as a moving average. The combined model (with
certain restrictions) generates a wide variety of stationary stochastic
processes.
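As a worked illustration, the first two moments of the simplest moving average can be written out. The sketch assumes the labeling $y_t = a_0 u_t + a_1 u_{t-1}$, disturbances with mean 0, and a common disturbance variance $\sigma^2$. The subscripts and the symbol $\sigma^2$ are notation adopted here for illustration.

```latex
\begin{align*}
E(y_t) &= 0, \\
\operatorname{Var}(y_t) &= (a_0^2 + a_1^2)\,\sigma^2, \\
\operatorname{Cov}(y_t, y_{t-1}) &= a_0 a_1\,\sigma^2, \\
\operatorname{Cov}(y_t, y_{t-s}) &= 0, \qquad s \ge 2 .
\end{align*}
```

None of these quantities depends on t, and the correlation is cut off after one lag. Under the same assumptions, the combined model could be written $y_t = \alpha y_{t-1} + a_0 u_t + a_1 u_{t-1}$, with $|\alpha| < 1$ as a typical restriction of the kind referred to.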

A stationary stochastic process is a time series in which the
statistical properties of observations in one interval of time are the
same as in another interval of time of the same length. More precisely,
the distribution of $y_t$ is the same as that of $y_s$; the joint
distribution of $y_t$ and $y_s$ is the same as that of $y_{t+r}$ and
$y_{s+r}$. In particular, the population mean and variance of $y_t$ do
not depend on t, and the covariance (or correlation) between $y_t$ and
$y_s$ depends only on t − s. A succinct summarization of such a time
series is the sample mean, the sample variance, and the set of observed
serial covariances or correlations; the latter is called the
correlogram. The periodogram or spectral function is a mathematical
(Fourier) transform of the correlogram.
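The summary statistics named above can be computed directly. The following Python sketch uses the divisor n for the serial covariances and a plain, unsmoothed cosine transform. Both are common conventions assumed here, since the text does not fix the definitions, and the function names are illustrative.

```python
import math


def serial_covariance(y, lag):
    """Sample covariance between observations `lag` steps apart (divisor n)."""
    n = len(y)
    mean = sum(y) / n
    return sum((y[t] - mean) * (y[t + lag] - mean) for t in range(n - lag)) / n


def correlogram(y, max_lag):
    """Serial correlations r_0, r_1, ..., r_max_lag."""
    c0 = serial_covariance(y, 0)
    return [serial_covariance(y, k) / c0 for k in range(max_lag + 1)]


def cosine_transform(c, lam):
    """(c_0 + 2 * sum_k c_k * cos(lam * k)) / (2 * pi) at frequency lam."""
    tail = sum(c[k] * math.cos(lam * k) for k in range(1, len(c)))
    return (c[0] + 2.0 * tail) / (2.0 * math.pi)


# A strictly alternating series: neighbours are opposite, so r_1 is near -1.
y = [1.0, -1.0] * 50
r = correlogram(y, max_lag=2)
low = cosine_transform(r, 0.0)       # weight at frequency 0
high = cosine_transform(r, math.pi)  # weight at frequency pi
print(r, low < high)
```

The alternating series puts most of its weight at the highest frequency, which is why the transform is larger at $\pi$ than at 0. In practice the transform is usually smoothed; that refinement is omitted here.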

Some stationary time series can be represented as

(18)    $y_t = \sum_{k=1}^{m} (A_k \cos \lambda_k t + B_k \sin \lambda_k t)$,

where the A’s and B’s are uncorrelated random variables with $A_k$ and
$B_k$ having mean 0 and the same variance, and the $\lambda$’s are
constants with $0 < \lambda_1 < \lambda_2 < \ldots < \lambda_m < \pi$.
The spectral distribution function $F(\lambda)$ is the sum of the
variances of the $A_k$ or $B_k$ corresponding to
$\lambda_k \le \lambda$. A broader representation is obtained by
replacing the finite sum in equation 18 by an infinite sum, and a
completely general representation is obtained by replacing the sum by
an integral. Under certain conditions the spectral distribution
function $F(\lambda)$ can be written as the integral of a nonnegative
function $f(\lambda)$, which is called the spectral density.
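The link between representation 18 and the lagged covariances can be made explicit with a short calculation. The sketch writes the representation with frequencies $\lambda_1 < \ldots < \lambda_m$ and uses $\sigma_k^2$ for the common variance of $A_k$ and $B_k$; this symbol is introduced here for illustration.

```latex
\begin{align*}
\operatorname{Cov}(y_t, y_s)
  &= \sum_{k=1}^{m} \sigma_k^2
     \left(\cos\lambda_k t \,\cos\lambda_k s
         + \sin\lambda_k t \,\sin\lambda_k s\right) \\
  &= \sum_{k=1}^{m} \sigma_k^2 \cos\lambda_k (t - s), \\
F(\lambda) &= \sum_{k:\ \lambda_k \le \lambda} \sigma_k^2 .
\end{align*}
```

The cross terms vanish because the A’s and B’s are uncorrelated, so the covariance depends only on t − s, as stationarity requires, and the variance of $y_t$ equals the total $F(\lambda_m)$.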

The interpretation of the spectral analysis of a time series can
perhaps best be clarified by analogy. When light is passed through a
prism, it is spread out into a spectrum. If the source is an
incandescent element, the spectrum consists of a finite number of
bright lines. The mathematical representation of this light is equation
18, where t is now continuous time. The brightness of the spectral line
at frequency $\lambda_j$ is a function of $A_j^2 + B_j^2$. Another
source of light might produce a smooth spectrum with some brightness at
all frequencies in the range. The energy over an interval of
frequencies can be given as a function of the integral of the spectral
density over this interval.

The statistical models presented here (1) are univariate, (2) are
characterized by lagged covariances (or variances and lagged
correlations, or equivalently the spectral function), and (3) “explain”
the series internally, that is, in terms of a mechanism involving this
one quantity. These models have been studied extensively by
mathematicians, and statistical procedures have been worked out fairly
well, at least on the basis of large samples. Hannan (9) gives a good
summary of the methods of statistical analysis at a somewhat
sophisticated mathematical level; an introduction requiring less
mathematical background is being prepared (2).

The effect of outside influences can be incorporated into these models.
The first-order stochastic difference equation can include a regression
variable:

(19)    $y_t = \alpha y_{t-1} + \beta z_t + u_t$.

The endogenous variable at a given time is affected by its immediately
preceding value and by an exogenous variable. This model can be
extended to any number of lags and any number of external variables.
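A model with one lag and one exogenous variable can be fitted by solving the two least-squares normal equations. The Python sketch below assumes the form $y_t = \alpha y_{t-1} + \beta z_t + u_t$ with no intercept. The function name `fit_lagged_regression`, the simulated exogenous series, the seed, and the values 0.5 and 2.0 are all hypothetical choices made for illustration.

```python
import random


def fit_lagged_regression(y, z):
    """Least-squares (alpha, beta) in y_t = alpha*y_{t-1} + beta*z_t + u_t."""
    s_ll = s_lz = s_zz = s_ly = s_zy = 0.0
    for t in range(1, len(y)):
        lag, ex, cur = y[t - 1], z[t], y[t]
        s_ll += lag * lag   # sum of squares of the lagged endogenous variable
        s_lz += lag * ex    # cross product of the two explanatory variables
        s_zz += ex * ex     # sum of squares of the exogenous variable
        s_ly += lag * cur   # cross products with the dependent variable
        s_zy += ex * cur
    det = s_ll * s_zz - s_lz * s_lz
    alpha = (s_ly * s_zz - s_zy * s_lz) / det
    beta = (s_zy * s_ll - s_ly * s_lz) / det
    return alpha, beta


rng = random.Random(1)  # fixed seed, chosen arbitrarily
n = 4000
z = [rng.gauss(0.0, 1.0) for _ in range(n)]  # stand-in exogenous series
y = [0.0]
for t in range(1, n):
    y.append(0.5 * y[-1] + 2.0 * z[t] + rng.gauss(0.0, 1.0))

alpha_hat, beta_hat = fit_lagged_regression(y, z)
print(round(alpha_hat, 2), round(beta_hat, 2))
```

Adding further lags or exogenous variables enlarges the system of normal equations but does not change the approach.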

Several  time  series 

The univariate models can be generalized by replacing the single
variables by vectors. For example, equation 19 can be replaced by

(20)    $y_{it} = \sum_j \alpha_{ij} y_{j,t-1} + \sum_k \beta_{ik} z_{kt} + u_{it}, \quad i = 1, \ldots, p$.

Here the $u_{it}$ are independent from one time to the next though not
necessarily from one variable to the next. The system permits any
endogenous variables at t − 1 and any exogenous variables to affect
each endogenous variable at time t. Some of the coefficients
$\alpha_{ij}$ and $\beta_{ik}$ may be 0, indicating no effect. This
model can be generalized further to allow for interactions between
variables at the same time as

(21)    $\sum_j \gamma_{ij} y_{jt} = \sum_j \alpha_{ij} y_{j,t-1} + \sum_k \beta_{ik} z_{kt} + u_{it}, \quad i = 1, \ldots, p$.

When $\gamma_{ii} = 1$ and $\gamma_{ij} = 0$, $i \ne j$, we obtain
equation 20. Of course, more lags can be included, and $z_{kt}$ can
denote a lagged, exogenous variable.
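Relation 21 can be solved for the current endogenous variables whenever the matrix of the $\gamma_{ij}$ is nonsingular, which brings it back to the form of equation 20 with transformed coefficients. The Python sketch below does this for p = 2. The function names, the matrix layout, and all numerical values are hypothetical, and this solving step is a standard device rather than something stated in the text.

```python
def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix given as a list of two rows."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("the matrix of gamma coefficients must be nonsingular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def solve_for_current(gamma, alpha, beta):
    """Coefficients of y_{t-1} and z_t after multiplying (21) by gamma^{-1}."""
    g_inv = inverse_2x2(gamma)
    return matmul(g_inv, alpha), matmul(g_inv, beta)


gamma = [[1.0, 0.5], [0.0, 1.0]]  # y_2t enters the first equation at time t
alpha = [[0.6, 0.0], [0.2, 0.4]]  # lagged effects; 0.0 means no effect
beta = [[1.0], [0.0]]             # one exogenous variable, first equation only
a_red, b_red = solve_for_current(gamma, alpha, beta)
print(a_red, b_red)
```

With `gamma` equal to the identity matrix the coefficients are returned unchanged, which corresponds to the special case that gives equation 20. The disturbances are transformed in the same way, so they generally remain correlated across variables.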

These  models  are  flexible  and  dynamic. 
Exogenous  variables  can  be  included,  and  the 
effect  of  lagged,  endogenous  variables  can  be 
extended  as  far  in  the  past  as  desired;  the 
models  can  generate  fluctuations.  The  models 
are  explicit;  they  indicate  relations  between 
directly  observed  variables.  The  models  are 
usually  easy  to  interpret,  and  the  coefficients 
can  be  given  substantive  meaning. 

The application of these models requires some a priori knowledge of the
phenomena being studied. Endogenous variables must be distinguished
from exogenous variables to determine equation 20. Also, a limit must
be put on the number of lags to be included. Of course, the longer the
observed series, the more terms can be included. Exclusion of a term in
an equation, either $y_{j,t-s}$ for specified j and lag s or $z_{kt}$
for specified k, can be determined by testing the null hypothesis that
the coefficient of that variable is 0. Whether the entire set of y’s
are independent of a given z, or a set of z’s, can be decided by
testing whether the corresponding $\beta$’s are 0.
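One common way to carry out such a test, assuming the regression-type large-sample theory mentioned earlier applies, is to compare an estimated coefficient with its estimated standard error. The notation below is introduced for illustration only: the rows of $X$ hold the lagged endogenous and exogenous variables of one equation, $y$ holds the corresponding current values, $\theta$ collects the coefficients of that equation, $T$ is the number of usable observations, and $q$ is the number of coefficients.

```latex
\begin{align*}
\hat{\theta} &= (X'X)^{-1} X' y, \\
s^2 &= \frac{(y - X\hat{\theta})'(y - X\hat{\theta})}{T - q}, \\
t_j &= \frac{\hat{\theta}_j}{\sqrt{s^2 \left[(X'X)^{-1}\right]_{jj}}} .
\end{align*}
```

In large samples $t_j$ is referred to the standard normal distribution; a value of $|t_j|$ beyond about 2 would lead to rejecting, at roughly the 5 percent level, the hypothesis that the j-th coefficient is 0.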

Econometricians have studied these models considerably and have
developed appropriate statistical methods. Unfortunately, there is no
straightforward, easily understandable, and relatively complete
exposition of these models of multiple time series analysis, but a
fairly good introduction with an econometric slant is available in
reference 11. A more elementary discussion has been given in economic
terms by Klein (12). Many papers have been published in the Annals of
Mathematical Statistics, Econometrica, Biometrika, and Journal of the
Royal Statistical Society, Series B.

Spectral analysis can also be extended to multiple time series. The
generalization of model 18 is

(22)    $y_{it} = \sum_{k=1}^{m} (A_{ik} \cos \lambda_k t + B_{ik} \sin \lambda_k t)$,

where the sets of random variables $(A_{1k}, \ldots, A_{pk})$ and
$(B_{1k}, \ldots, B_{pk})$ are statistically independent with common
covariance matrix and are independent of other A’s and B’s.

Time  series  of  psychologic  quantities 

One aspect of psychologic statistics that differs substantially from
others is that a number of observed variables are used to obtain
indirect measurements on underlying latent variables. It is these
underlying variables that are considered to be of primary psychologic
significance. In such a time series analysis, therefore, there are a
large number of observed variables. To understand the process, a large
number of certain kinds of variables must be reduced to a small number.
Factor analysis may be useful in this reduction. This feature is
similar to that of economics where a number of variables, say prices of
different kinds of goods, are to be combined into one representative
variable, a price index. The aggregated variables are then subjected to
time series analysis. The economic problem of “index numbers” or
“aggregation” is different from the psychologic problem of factors or
primary abilities, because the index number is supposed to approximate
the total relevant effect of the quantities involved. On the other
hand, the factor is considered to be the common part of several
observed quantities. The appropriate mathematical models are
correspondingly different. The similarity in approach, however, is that
the reduction of many variables to a small number is to be carried out
before completing a time series study.